Remove the Bijectors extension
#219
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
…l into remove_bijectors
include("normallognormal.jl")
include("unconstrdist.jl")
struct Dist{D<:ContinuousMultivariateDistribution}
The contents of `unconstrdist.jl` have been moved here.
Benchmark Results
| Benchmark suite | Current: 8c657e0 | Previous: f9d7f0b | Ratio |
|---|---|---|---|
| `normal/RepGradELBO + STL/meanfield/Zygote` | 2215502796 ns | 2640111495.5 ns | 0.84 |
| `normal/RepGradELBO + STL/meanfield/ReverseDiff` | 571883445 ns | 611337145 ns | 0.94 |
| `normal/RepGradELBO + STL/meanfield/Mooncake` | 196000337.5 ns | 246507667 ns | 0.80 |
| `normal/RepGradELBO + STL/fullrank/Zygote` | 1708922306 ns | 2061469121 ns | 0.83 |
| `normal/RepGradELBO + STL/fullrank/ReverseDiff` | 1103359376 ns | 1166722255 ns | 0.95 |
| `normal/RepGradELBO + STL/fullrank/Mooncake` | 485775019 ns | 681311257.5 ns | 0.71 |
| `normal/RepGradELBO/meanfield/Zygote` | 1269167076.5 ns | 1592865651 ns | 0.80 |
| `normal/RepGradELBO/meanfield/ReverseDiff` | 276315120 ns | 304657276 ns | 0.91 |
| `normal/RepGradELBO/meanfield/Mooncake` | 144033159.5 ns | 174167780 ns | 0.83 |
| `normal/RepGradELBO/fullrank/Zygote` | 832644703 ns | 1116928390 ns | 0.75 |
| `normal/RepGradELBO/fullrank/ReverseDiff` | 533858360 ns | 605576908 ns | 0.88 |
| `normal/RepGradELBO/fullrank/Mooncake` | 403687069 ns | 563720905 ns | 0.72 |
This comment was automatically generated by workflow using github-action-benchmark.
AdvancedVI.jl documentation for PR #219 is available at:
The updates to the documentation and README have been suppressed for clarity and will be added later once the PR is approved.
Consider the case where we would like to approximate a constrained target distribution with density $\pi : \mathcal{X} \to \mathbb{R}_{> 0}$ using an unconstrained variational approximation with density $q : \mathbb{R}^d \to \mathbb{R}_{> 0}$. The canonical way to deal with this, popularized by the ADVI paper¹, is to use a bijective transformation ("Bijectors") $b : \mathbb{R}^d \to \mathcal{X}$ such that $q$ is augmented into $q_{b}$ as

$$
q_{b}(x) = q\left(b^{-1}(x)\right) \left|\det J_{b^{-1}}(x)\right| ,
$$

where $J_{b^{-1}}$ is the Jacobian of $b^{-1}$. Then `AdvancedVI` needs to solve the problem

$$
\min_{q \in \mathcal{Q}} \; \mathrm{D}(q_{b}, \pi) ,
$$

where $\mathcal{Q}$ is the variational family and $\mathrm{D}$ is the divergence being minimized. But notice that the optimization is, in reality, over $q$. Therefore, `AdvancedVI` often needs access to the underlying `q`. I will refer to this as the "primal" scheme.

Previously, this was done by giving special treatment to `q <: Bijectors.TransformedDistribution` through the `Bijectors` extension. In particular, the `Bijectors` extension had to add specializations to many methods that simply unwrap a `TransformedDistribution` before doing their work. This behavior is difficult to document and, therefore, wasn't fully explained in the documentation. Furthermore, each of the relevant methods needs to be specialized in the `Bijectors` extension, which resulted in multiplicative complexity, especially for unit testing.

This, however, is unnecessary. Instead, there exists an equivalent "dual" problem that operates in unconstrained space by approximating the transformed posterior

$$
\pi_{b^{-1}}(\eta) = \pi\left(b(\eta)\right) \left|\det J_{b}(\eta)\right| .
$$

That is, we can solve the problem

$$
q^* = \arg\min_{q \in \mathcal{Q}} \; \mathrm{D}(q, \pi_{b^{-1}})
$$

and then post-process the output to retrieve $q_{b}^*$, the constrained approximation obtained by pushing $q^*$ through $b$.
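For the KL divergence in particular, the equivalence of the two schemes can be checked with a change of variables. The derivation below is only a sketch: it assumes $b$ is a diffeomorphism, writes $J_b$ for its Jacobian and $\pi_{b^{-1}}(\eta) = \pi(b(\eta)) \left|\det J_b(\eta)\right|$ for the transformed target (notation chosen here for illustration), and makes no claim about other divergences.

$$
\begin{aligned}
\mathrm{KL}(q_b \,\|\, \pi)
&= \int_{\mathcal{X}} q_b(x) \log \frac{q_b(x)}{\pi(x)} \, \mathrm{d}x \\
&= \int_{\mathbb{R}^d} q(\eta) \log \frac{q(\eta) \left|\det J_b(\eta)\right|^{-1}}{\pi(b(\eta))} \, \mathrm{d}\eta
\qquad (x = b(\eta)) \\
&= \int_{\mathbb{R}^d} q(\eta) \log \frac{q(\eta)}{\pi(b(\eta)) \left|\det J_b(\eta)\right|} \, \mathrm{d}\eta \\
&= \mathrm{KL}(q \,\|\, \pi_{b^{-1}}) .
\end{aligned}
$$

The second line uses $q_b(b(\eta)) = q(\eta) \left|\det J_b(\eta)\right|^{-1}$ and $\mathrm{d}x = \left|\det J_b(\eta)\right| \mathrm{d}\eta$, so the two objectives agree for every $q$ and therefore share the same minimizer.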
Within this context, this PR removes the `Bijectors` extension to fix this problem. Here are the rationales:

- `AdvancedVI` doesn't need to implement the primal scheme. In fact, the upcoming interface in `Turing` is planned to implement the dual scheme above.
- `KLMinNaturalGradDescent`, `KLMinWassFwdBwd`, and `FisherMinBatchMatch`, for example, do not work in constrained support at all, so they can only be used via the dual scheme. So the way that `KLMinRepGradDescent` and friends implemented the primal scheme is a bit redundant in terms of consistency at this point.

Instead, a tutorial has been added to the documentation on how to use VI with constrained supports via the dual scheme.
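To make the dual scheme concrete, here is a minimal sketch in plain Julia for a one-dimensional target supported on $(0, \infty)$. It does not use the `AdvancedVI` or `Bijectors` APIs: every name in it (`logpi`, `b`, `logabsdetjac_b`, `logpi_unconstrained`, `sample_q`, `sample_qb`) is hypothetical, the target is an arbitrary Gamma-shaped density, and a fixed Gaussian stands in for whatever $q^*$ a VI algorithm would return.

```julia
# Hypothetical illustration; none of these names come from AdvancedVI or Bijectors.

# Unnormalized log-density of a target on (0, ∞): Gamma(shape = 3, rate = 2)
# up to an additive constant.
logpi(x) = 2 * log(x) - 2 * x

# Bijection b : ℝ → (0, ∞) and the log-absolute-Jacobian of b at η.
b(η) = exp(η)
logabsdetjac_b(η) = η   # d/dη exp(η) = exp(η), so log|J_b(η)| = η

# Log-density of the transformed target π(b(η)) |J_b(η)|, defined on all of ℝ.
# This is what an unconstrained VI algorithm would be pointed at.
logpi_unconstrained(η) = logpi(b(η)) + logabsdetjac_b(η)

# Stand-in for the fitted unconstrained approximation q* = N(μ, σ²).
sample_q(n; μ = 0.2, σ = 0.5) = μ .+ σ .* randn(n)

# Post-processing: pushing draws of q* through b yields draws on (0, ∞).
sample_qb(n) = b.(sample_q(n))
```

The only problem-specific pieces are the bijection and its log-Jacobian; the algorithm itself never needs to know that the original support was constrained.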
Footnotes
1. Kucukelbir, A., Tran, D., Ranganath, R., Gelman, A., & Blei, D. M. (2017). Automatic differentiation variational inference. Journal of Machine Learning Research, 18(14), 1-45.